Add BeitForSemanticSegmentation #14096
Thanks for adding this! You have two files in the PR by mistake, I believe:
- src/transformers/model/beit/test.ipynb
- src/transformers/model/beit/test_semantic.py
bias=bias,
dilation=dilation,
)
self.bn = nn.BatchNorm2d(out_channels)
Note that this is the first module I've seen that adds a BatchNorm layer, so if it is used with the Trainer, we should probably add something to handle the weight decay.
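For illustration, here is a minimal sketch of one common way to keep normalization layers out of weight decay, using plain PyTorch optimizer parameter groups. It is not the Trainer's actual implementation: `ConvBlock` is a hypothetical stand-in for the module under review, and `split_weight_decay_params` is a helper name made up for this example.

```python
import torch
from torch import nn


class ConvBlock(nn.Module):
    """Hypothetical conv + BatchNorm block, standing in for the module in this PR."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.activation = nn.ReLU()

    def forward(self, x):
        return self.activation(self.bn(self.conv(x)))


def split_weight_decay_params(model, weight_decay=0.01):
    """Build two optimizer groups: one with weight decay, one without.

    Assumption for this sketch: all parameters of normalization layers and
    all biases are exempt from weight decay.
    """
    norm_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.LayerNorm)
    decay, no_decay = [], []
    for module in model.modules():
        # recurse=False visits each parameter exactly once, via its owning module
        for name, param in module.named_parameters(recurse=False):
            if not param.requires_grad:
                continue
            if isinstance(module, norm_types) or name == "bias":
                no_decay.append(param)
            else:
                decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


model = ConvBlock(3, 8)
param_groups = split_weight_decay_params(model)
optimizer = torch.optim.AdamW(param_groups, lr=1e-4)
```

Grouping by module type rather than by parameter name means a BatchNorm weight is exempted regardless of what the attribute is called in the model.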
@NielsRogge
Hi @kamalkraj, yes, I do plan to add that. However, this will become easier once the Image feature is available in the Datasets library.
Okay @NielsRogge, I can update the Flax version of BEiT after this PR is merged.
Looks good to me, thanks for working on this @NielsRogge!
fc42e93 to e61b580
What does this PR do?
This PR is a follow-up to #12994 and adds the semantic segmentation head of BEiT. BEiT is currently the state-of-the-art model for semantic segmentation (i.e. the task of labeling each pixel of an image) on datasets like ADE20k and CityScapes (see this chart on paperswithcode).
Now it's easily available with a HuggingFace API! :)
Models are on the hub: https://huggingface.co/models?search=ade-640
Here's a notebook for quick inference: https://colab.research.google.com/drive/1AS3z0plOhWWibBvDgsQkRbfJB73vSsR3?usp=sharing
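To make "labeling each pixel" concrete, here is a small dependency-free sketch of the usual post-processing step: given per-pixel class scores of shape (num_labels, height, width), pick the highest-scoring label at every pixel. The function name and the toy values are made up for illustration and are not part of this PR; real model outputs would be tensors, and may need resizing to the input image size first.

```python
def logits_to_segmentation_map(logits):
    """Return a (height, width) grid of label indices from (num_labels, height, width) scores."""
    num_labels = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [
            # argmax over the label dimension for this pixel
            max(range(num_labels), key=lambda label: logits[label][row][col])
            for col in range(width)
        ]
        for row in range(height)
    ]


# Toy scores for 3 labels on a 2x2 image (illustrative values only)
toy_logits = [
    [[2.0, 0.1], [0.3, 0.2]],  # label 0
    [[0.5, 1.5], [0.1, 0.9]],  # label 1
    [[0.1, 0.2], [1.2, 0.4]],  # label 2
]
segmentation_map = logits_to_segmentation_map(toy_logits)
print(segmentation_map)  # [[0, 1], [2, 1]]
```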